THE APPLICABILITY AND EFFECTIVENESS OF CLUSTER ANALYSIS

By D. S. Ingram, IBM, and 
A. L. Actkinson, Mathematical Physics Branch 


1.0 SUMMARY 


The objective of this internal note is to provide insight into the characteristics
which determine the performance of a clustering algorithm. It demonstrates that, in
order for the techniques which are examined to accurately cluster data, two
conditions must be simultaneously satisfied. The first condition is that the data
must have a particular structure, and the second is that the parameters chosen for
the clustering algorithm must be correct. By examining the structure of the data
from the C1 flight line, it is clear that there is no single set of parameters that
can be used to accurately cluster all the different crops. The effectiveness of
either a noniterative or an iterative clustering algorithm in accurately clustering
data representative of the C1 flight line is questionable. This means that, in order
to use cluster analysis in its present form for applications like assisting in the
definition of field boundaries and evaluating the homogeneity of a field, one must
have extensive a priori knowledge. Modifications to existing techniques, or entirely
new techniques, are necessary for clustering to be a reliable tool for
representative data sets.

A modification to existing clustering methods is proposed. This involves 
the use of goodness of fit tests to determine, in a quantitative manner, a measure 
of the unimodality of a cluster. This also has applications to quantitatively 
evaluating the homogeneity of test and training fields. 


2.0 INTRODUCTION 


Cluster analysis is a decision-making process in which similar measurements 
are grouped together. The primary advantage of cluster analysis is that it is 
not necessary to assume a statistical model for the data. Typical applications 
which have been identified are evaluating field homogeneity, boundary definition, 
selecting homogeneous data from nonhomogeneous data, and use as an unsupervised 
classifier. An objective of this internal note is to determine the factors which 
affect the ability of a clustering algorithm to perform these functions. These 
factors are examined in view of the data analysis requirements associated with 
processing multispectral scanner data for agricultural crops from the C1 flight
line.


For the clustering algorithms (ref. 1) which are examined to accurately cluster
data, two conditions must be simultaneously satisfied. First, the data must have a
particular structure, and, second, the correct parameters must be used in the
clustering algorithm. To demonstrate these conditions, some experiments using two
sets of simulated data are described. The structure of the first set of data is such
that the clustering algorithm will accurately cluster the data. The statistics of
the second set of data are determined from the C1 flight line. These experiments
indicate that the results obtained by using existing cluster analysis techniques for
evaluating field homogeneity, defining boundaries, selecting homogeneous data from
nonhomogeneous data, and unsupervised classification are likely to have little
meaning unless one essentially knows the answer before the data are processed. This
is particularly significant because it means that it would be necessary to determine
the parameters to be used in the algorithm for each flight line and for each
different set of crops.

The key question which must be answered to make clustering a scientific tech- 
nique rather than an art is whether a set of data is unimodal or multimodal. It 
is proposed that two goodness of fit tests be investigated in order to quantify 
the concepts of unimodality and homogeneity. This would be very valuable in de- 
termining the appropriateness of the assumptions of the probability density func- 
tion of the multivariate data. The assumption of the multivariate normal distri- 
bution is used extensively in feature selection and pattern classification. 


3.0 ANALYSIS 


In this section clustering is initially described from an intuitive point of 
view. The relationship between the form of the data and the result produced by a 
clustering algorithm is investigated for some limiting cases. The results obtained 
by processing data that corresponds to an agricultural image are presented for two 
sets of simulated observations. The first set is chosen such that the algorithm 
can produce accurate results. The statistics of the second set of data were
determined from the C1 flight line. Both a noniterative and an iterative algorithm
are used to process the data, which are representative of the C1 flight line.

As is demonstrated, the structure of the data corresponding to the agricultural
crops on the C1 flight line is such that no single set of parameters can be used to
accurately cluster the data. The clustering results depend on the parameters used
in the algorithm. Hence, any measure of field homogeneity is input-parameter
dependent. The fundamental problem, then, is to determine whether or not a set of
data is unimodal.


3.1 Cluster Analysis 

Cluster analysis is a decision-making process in which similar measurements 
are grouped together. The ability of an algorithm to group data together depends
on the structure of the data. To illustrate this condition, consider two sets of
two-dimensional data. In figure 1(a) the data are uniformly spaced in both
coordinates, and in figure 1(b) the data are neatly grouped into three distinct
subsets.

In order to apply a cluster algorithm, a function must be used which determines
whether two observations are similar. For the sake of illustration, let the
similarity function be a distance measure. If a measurement is within a specified
radius of a cluster mean, then that measurement is an element of the specified
cluster. The specified radius is a parameter of the algorithm. Consider the
relationship between the number of clusters and the radius, R, for the data in
figure 1(a). If R is very small, then the number of clusters, N, equals the number
of data points, and if R is very large, then there is one cluster. As R changes from
very small to very large, the N versus R curve would be similar to the graph shown
in figure 2(a) for the data of figure 1(a). The same procedure can be carried out
for the data of figure 1(b). The limiting cases are the same; however, the number of
clusters is constant over a range of R. Since the form of the data is known, it is
clear that the correct number of clusters is three and that a value of R such that
R_1 < R < R_2, the range over which N = 3, is acceptable. This is equivalent to
finding the set of parameters that produce the correct answer or to training the
algorithm.

This example clearly shows that, in order for a clustering algorithm to
effectively cluster data, two conditions must be simultaneously satisfied. The first
condition is that the structure of the data must be such that the data can be
clustered. If this condition is satisfied, then it is necessary to choose the
correct parameter in the algorithm. A value of R outside the range R_1 < R < R_2,
over which the number of clusters is constant at the correct value, would not
cluster the data correctly.
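
The dependence of N on R described in this section can be reproduced with a short
sketch. The Python fragment below is an illustration only. It assumes a simple
one-pass rule (a point joins the nearest cluster mean if it lies within R of that
mean; otherwise it starts a new cluster), which is not the CLUST1 algorithm of
reference 1, and the two data sets are hypothetical stand-ins for the uniformly
spaced and grouped data of figure 1.

```python
import math

def one_pass_cluster(points, radius):
    """One-pass (assumed) rule: join the nearest cluster mean if it is within
    `radius`, otherwise start a new cluster. Returns the list of cluster means."""
    means, counts = [], []
    for p in points:
        best, best_d = None, None
        for k, m in enumerate(means):
            d = math.dist(p, m)
            if best_d is None or d < best_d:
                best, best_d = k, d
        if best is not None and best_d <= radius:
            counts[best] += 1
            n = counts[best]
            # running update of the cluster mean
            means[best] = tuple(mc + (pc - mc) / n
                                for mc, pc in zip(means[best], p))
        else:
            means.append(tuple(p))
            counts.append(1)
    return means

# Hypothetical data: a 6 x 6 uniform grid and three tight groups of five points.
uniform = [(float(i), float(j)) for i in range(6) for j in range(6)]
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
grouped = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

for r in (0.1, 1.0, 2.0, 4.0, 8.0, 20.0):
    n_uniform = len(one_pass_cluster(uniform, r))
    n_grouped = len(one_pass_cluster(grouped, r))
    print(f"R = {r:5.1f}   N(uniform) = {n_uniform:2d}   N(grouped) = {n_grouped:2d}")
```

For the grouped data the count remains at three for R = 2, 4, and 8, while both data
sets reduce to as many clusters as points for very small R and to a single cluster
for very large R.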


3.2 Simulated Data

The concept of clustering data from an image is further developed by considering
two sets of simulated data. The clustering algorithms used include both a one-pass
and an iterative technique. The spatial configuration of the classes in the image
is similar to an agricultural scene. The first set of data is such that the
clustering algorithm will effectively cluster the data. The second set of data is
representative of the C1 flight line. Neither the noniterative nor the iterative
algorithm is effective on the simulated C1 data.

3.2.1 Ideal case.- The simulated data generated for this case are described
in reference 2. The noniterative clustering algorithm used is the CLUST1 option in
ASTEP (ref. 1). Figure 3 illustrates the way the image is subdivided. For this
example only two channels of data, 11 and 12 of reference 2, are processed. The
mean ±1 standard deviation for each class is plotted in figure 4. Each element
of the field is generated from a normal distribution with a mean of $\mu_i$ and a
standard deviation of $\sigma_i$, for i = 1, ..., 5, for each channel. The data are
uncorrelated from channel to channel.
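
The data generation step described above can be sketched as follows. The class
means and standard deviations in the fragment are hypothetical placeholders, since
the values used in reference 2 are not reproduced in this note; only the procedure
(independent normal draws in each channel for five classes) follows the text.

```python
import random

# Hypothetical (means, standard deviations) per channel for the five classes.
# These numbers are placeholders and are NOT the statistics of reference 2.
CLASS_STATS = {
    1: ((20.0, 30.0), (3.0, 3.0)),
    2: ((50.0, 30.0), (3.0, 3.0)),
    3: ((80.0, 30.0), (3.0, 3.0)),
    4: ((35.0, 70.0), (3.0, 3.0)),
    5: ((65.0, 70.0), (3.0, 3.0)),
}

def simulate_field(means, stds, n_pixels, rng):
    """Draw n_pixels observations; every channel is an independent normal
    variable, so the data are uncorrelated from channel to channel."""
    return [tuple(rng.gauss(m, s) for m, s in zip(means, stds))
            for _ in range(n_pixels)]

def sample_mean(field):
    """Per-channel sample mean of a list of observations."""
    n = len(field)
    return tuple(sum(obs[c] for obs in field) / n
                 for c in range(len(field[0])))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = {cid: simulate_field(mu, sd, 200, rng)
         for cid, (mu, sd) in CLASS_STATS.items()}

for cid, field in image.items():
    print(cid, sample_mean(field))
```

The printed sample means should lie close to the assumed class means, which gives a
quick check that the simulated fields have the intended statistics.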

The data were clustered for several values of R with the condition that
C = 2R. Although it is not obvious that the condition C = 2R will yield the "best"
results, this condition does appear to be a reasonable way of relating C and R.
In each case the initial value for the maximum number of clusters was 2C and the
initial values for the means of those clusters were 0. The plot of the number of
clusters versus R is shown in figure 5. The number of clusters is constant for
10 < R < 30, and at each value of R in this range the clusters are the same. The
image map displayed by ASTEP is shown in figure 6 for R = 20. Each observation is
classified correctly. This is exactly the result one would expect for data with this
structure.

The question which one must ask is how to know that each of the five clusters
is unimodal. For values of 32 < R < 50 there are three clusters. The clusters
are not the same three clusters for all values of R. However, for R = 32, 34,
and 36 the three clusters are the same, and one might suspect that there are three
clusters instead of five.

3.2.2 Simulated C1 data.- The statistics of the crops along the C1 flight
line are listed in table I. The mean ±1 standard deviation for each class in
channels 6, 10, and 12 is plotted in figures 7 and 8. The statistics listed in
table I were used to generate a data tape which represents the fields in figure 3.

The ellipses which are drawn in figures 7 and 8 have their principal axes parallel
to the measurement coordinates. This is not the case for actual C1 data, for which
the principal axes of each of the ellipses would be inclined to the measurement
coordinates. Examining figures 7 and 8, it is clear that the structure of the data
is such that it would be difficult to find a set of parameters that could cluster
all the crops. The most obvious reason for this is the size of the standard
deviation of Wheat2 as compared to the difference between the means of the other
crops, such as Alfalfa.

The noniterative algorithm described in reference 1 was used to process these
data with the same set of initial conditions described in section 3.2.1. The plot
of N versus R is shown in figure 9. There is clearly no well-defined interval
for which there are eight clusters. For this set of conditions the best results
appear to occur for R = 5, as shown in figure 10. It is possible to decide what is
best only because we know the answer.

Soybeans and Bare Soil are accurately classified. Corn1 and Oats are classified
with fair accuracy. It is not possible to distinguish among Red Clover, Red
Clover2, Alfalfa, and Wheat2. For this case the Red Clover field appears to be
nonhomogeneous while the Corn1 field appears to be homogeneous. As another
illustration of the concept of homogeneity, consider figure 11. In this case R = 16
and the image divides nicely into two categories, B and C. The field labeled C
appears to be homogeneous and indeed is Wheat2. The field labeled B also appears to
be homogeneous but in fact consists of seven different crops.

The same data were processed with ISODATA (ref. 4) using the parameters
suggested in reference 5, namely DLMIN and STDMAX equal to 3.2 and 4.5, respectively.
For the best case (fig. 12), Red Clover and Alfalfa were indistinguishable and the
accuracy of the classification of Oats and Corn is fair. This case took 20
iterations and used an NMIN value of 30. The term NMIN is the minimum number of
points allowed in a cluster. Changing the NMIN value to 15 resulted in Wheat, Red
Clover, Red Clover2, and Alfalfa being poorly classified while the accuracies of
Corn and Oats classification improved somewhat (fig. 13). These cases illustrate
that the choice of NMIN affects the accuracy of clustering. Changing NMIN may cause
some accuracies to improve while others deteriorate.

Chaining was applied in each of the above cases. The results without chaining 
were much worse (figs. 14 and 15 for NMIN equal to 30 and 15, respectively).

Using a different channel set, channels 1, 6, 9, and 12, the ISODATA
classification maps were figures 16, 17, and 18, after 18, 19, and 20 iterations,
respectively. The value used for NMIN was 30, and no chaining was used. The
20-iteration case (fig. 18) was less accurate than the corresponding three-channel
case given in figure 14. Also, the results for iteration 18 were much better than
those for iteration 19.

These results demonstrate that the ISODATA classification is dependent on the
number of iterations, the number and choice of channels, and the choice of NMIN.
No criteria currently exist for selecting these values without extensive a priori
knowledge. Even for the best choice, the accuracies were very poor for some crops.
The effectiveness of the iterative algorithm in clustering data representative of
the C1 flight line is questionable.


4.0 RECOMMENDATION


In order to determine whether a set of data consists of one or more clusters,
it is necessary to determine whether the data set is unimodal or multimodal. This
can be determined in a quantitative manner by using goodness of fit tests. The
two tests considered here are the classical chi-squared test and the
Kolmogorov-Smirnov test. These could be applied to each potential cluster, and the
degree to which the cluster is unimodal could be determined.

A chi-squared random variable is defined as the sum of squares of independent
standard normal variables (ref. 6). Let X be normally distributed with zero
mean and unit variance; then the chi-squared random variable is

$$ Q = X_1^2 + X_2^2 + \cdots + X_k^2 \tag{1} $$

where there are k independent values of X. The parameter k represents the
number of degrees of freedom of the system. The chi-squared random variable can
be related to the multivariate normal distribution by noticing that the quadratic
form in the multivariate normal p.d.f. is a chi-squared random variable, that is,

$$ p(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right] \tag{2} $$

and

$$ Q = (x - \mu)^T \Sigma^{-1} (x - \mu) \tag{3} $$

where x is the n x 1 random variable, $\mu$ is the mean vector, and $\Sigma$ is the
n x n covariance matrix of the multivariate normal distribution. The chi-squared
variable in equation (3) has n degrees of freedom.
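
Equation (3) can be evaluated directly for each observation. The sketch below is
restricted to two channels so that the covariance matrix can be inverted by hand.
The mean vector and covariance values are hypothetical and are chosen only to show
that the sample average of Q comes out near the number of degrees of freedom, n = 2,
as expected for a chi-squared variable.

```python
import random

def quadratic_form_2d(x, mean, cov):
    """Q = (x - mean)^T cov^{-1} (x - mean) for a symmetric 2 x 2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

# Hypothetical statistics for one class in two uncorrelated channels.
MEAN = (50.0, 80.0)
COV = ((4.0, 0.0), (0.0, 9.0))   # standard deviations 2 and 3

rng = random.Random(1)  # fixed seed so the sketch is repeatable
samples = [(rng.gauss(MEAN[0], 2.0), rng.gauss(MEAN[1], 3.0))
           for _ in range(5000)]
q_values = [quadratic_form_2d(s, MEAN, COV) for s in samples]
mean_q = sum(q_values) / len(q_values)
print("average Q:", mean_q)
```

In practice the mean vector and covariance matrix would be estimated from the
candidate cluster itself; the known values are used here only to keep the sketch
short.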

The probability density function of a chi-squared random variable Q is

$$ p(Q) = \frac{Q^{(n-2)/2} \, e^{-Q/2}}{2^{n/2} \, \Gamma(n/2)}, \qquad Q > 0 \tag{4} $$

where n is the number of degrees of freedom and $\Gamma$ is the gamma function. The
cumulative distribution function (c.d.f.) of Q is

$$ P(Q) = \int_0^Q p(q) \, dq \tag{5} $$

Equation (5) can be evaluated in closed form when n is even. The results for
n = 2, 4, and 6 are

$$ n = 2: \qquad P(Q) = 1 - e^{-Q/2} \tag{6} $$

$$ n = 4: \qquad P(Q) = 1 - e^{-Q/2} \left( 1 + \frac{Q}{2} \right) \tag{7} $$

$$ n = 6: \qquad P(Q) = 1 - e^{-Q/2} \left[ 1 + \frac{Q}{2} + \frac{1}{2} \left( \frac{Q}{2} \right)^2 \right] \tag{8} $$

The number of degrees of freedom is the same as the number of independent channels
of a multispectral scanner.
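
Equations (6) to (8) are instances of the general closed form for even n,
P(Q) = 1 - e^{-Q/2} \sum_{j=0}^{n/2-1} (Q/2)^j / j!. A minimal sketch of that sum
is given below; the function name is introduced here for illustration and does not
refer to any routine in ASTEP or the references.

```python
import math

def chi2_cdf_even(q, n):
    """Closed-form chi-squared c.d.f. for an even number n of degrees of
    freedom: 1 - exp(-q/2) * sum_{j=0}^{n/2-1} (q/2)^j / j!"""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    if q <= 0:
        return 0.0
    half = q / 2.0
    term = 1.0    # (q/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, n // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total

for n in (2, 4, 6):
    print(n, [round(chi2_cdf_even(q, n), 4) for q in (0.5, 1.0, 2.0, 5.0, 10.0)])
```

As a check against equation (6), the median of Q for n = 2 is Q = 2 ln 2, at which
the function returns one half.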

If a data set has a multivariate normal distribution, then the numerically
constructed p.d.f. and c.d.f. should match the functions generated by equation (4)
and equation (5), respectively. The question of how well one function matches
another introduces the concept of goodness of fit. Two goodness of fit techniques
are developed, one related to the p.d.f. and the other related to the c.d.f. The
advantage of the goodness of fit techniques is that it is possible to establish
the percentile level of the fit.

In 1900 Pearson introduced the following measure (ref. 6), which is large
when the differences $(f_{oi} - f_{ci})$ are large:

$$ \chi^2 = \sum_{i=1}^{K} \frac{(f_{oi} - f_{ci})^2}{f_{ci}} \tag{9} $$

where $f_{oi}$ is the ith observed frequency of occurrence, $f_{ci}$ is the ith
expected or computed frequency, and K is the number of measurements. It has been
shown that $\chi^2$ is approximately a chi-squared variable with K - 1 degrees of
freedom. Hence, one could compute the frequency distribution of Q and evaluate
$\chi^2$ for K intervals along the distribution and determine the percentile level
from a table of percentiles of the chi-squared distribution. In the case of an
application to multispectral scanner data, the number of independent channels would
determine the number of degrees of freedom to generate the p.d.f. given by
equation (4), which is related to the computed frequency, $f_{ci}$. Given the
measurements, the observed frequency could be constructed. Then K values along the
$\chi^2$ axis could be chosen and equation (9) evaluated. The use of a table of
percentiles of the chi-squared distribution would determine the accuracy to which
the data base follows the chi-squared assumption.
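
The procedure just described can be sketched for the two-channel case. The fragment
below is an illustration under stated assumptions: the interval edges are
hypothetical, the values of Q are simulated rather than measured, and the computed
frequencies come from the n = 2 closed form of equation (6).

```python
import math
import random

def pearson_chi2(observed, expected):
    """Pearson measure of equation (9): sum of (f_o - f_c)^2 / f_c."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fc) ** 2 / fc for fo, fc in zip(observed, expected))

def cdf_n2(q):
    """Equation (6): chi-squared c.d.f. for n = 2 degrees of freedom."""
    return 1.0 - math.exp(-q / 2.0) if q > 0 else 0.0

def expected_counts(edges, n_samples):
    """Computed frequency in each interval [edges[i], edges[i+1])."""
    return [n_samples * (cdf_n2(hi) - cdf_n2(lo))
            for lo, hi in zip(edges, edges[1:])]

def observed_counts(q_values, edges):
    """Observed frequency in each interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for q in q_values:
        for i in range(len(counts)):
            if edges[i] <= q < edges[i + 1]:
                counts[i] += 1
                break
    return counts

# Hypothetical choice of K = 5 intervals along the Q axis.
EDGES = [0.0, 0.5, 1.0, 2.0, 4.0, math.inf]

rng = random.Random(2)  # fixed seed so the sketch is repeatable
# Simulated Q for two independent, standardized channels (equation (1), k = 2).
q_values = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
            for _ in range(1000)]

f_obs = observed_counts(q_values, EDGES)
f_exp = expected_counts(EDGES, len(q_values))
chi2 = pearson_chi2(f_obs, f_exp)
print(f_obs, [round(f, 1) for f in f_exp], round(chi2, 3))
```

The resulting value would then be compared with a table of percentiles of the
chi-squared distribution for K - 1 = 4 degrees of freedom.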

A method of determining the goodness of fit based on the distribution function
uses the Kolmogorov-Smirnov statistic. If P(x) is the theoretically constructed
cumulative distribution function and $P_c(x)$ is the numerically constructed
cumulative distribution function, then the Kolmogorov-Smirnov statistic is

$$ D = \max_{\text{all } x} \left| P(x) - P_c(x) \right| \tag{10} $$

This situation is illustrated in figure 19. The value of D and the number of
samples determine the accuracy to which $P_c(x)$ approximates P(x). In the case
of multispectral scanner data, the number of independent channels determines the
value of n to be used in equation (5). $P_c(x)$ would be generated from the
experimental data. The value of equation (10) and the number of samples would be
used as inputs to a table of acceptance limits for the Kolmogorov-Smirnov test of
goodness of fit.
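
A minimal sketch of equation (10) is given below for the two-channel case. It
assumes standardized channels (zero mean, unit variance) and simulated data; the
two-cluster sample is a hypothetical construction in which half of the observations
are offset in one channel while Q is still computed about a single mean.

```python
import math
import random

def ks_statistic(samples, cdf):
    """D = max over x of |P(x) - P_c(x)|, where P_c is the empirical step
    c.d.f. of the samples; both sides of every step are checked."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one sample is required")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        p = cdf(x)
        d = max(d, i / n - p, p - (i - 1) / n)
    return d

def cdf_n2(q):
    """Chi-squared c.d.f. for n = 2 degrees of freedom, 1 - exp(-q/2)."""
    return 1.0 - math.exp(-q / 2.0) if q > 0 else 0.0

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# One cluster: Q from two independent standard normal channels.
unimodal = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
            for _ in range(2000)]

# Two clusters (hypothetical): every other observation is offset by 3 in
# channel 1, but Q is still computed as if there were a single zero mean.
bimodal = [rng.gauss(3.0 if k % 2 else 0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
           for k in range(2000)]

d_uni = ks_statistic(unimodal, cdf_n2)
d_bi = ks_statistic(bimodal, cdf_n2)
print("D (one cluster): ", round(d_uni, 4))
print("D (two clusters):", round(d_bi, 4))
```

The value of D and the number of samples would then be taken to a table of
acceptance limits; the two-cluster sample is expected to give a markedly larger D.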

The effectiveness of the chi-squared statistic and the goodness of fit tests 
should be evaluated by using synthetic data. This is an effective procedure for 
checking out the implementation and power of the algorithms. Synthetic data which
are representative of aircraft and spacecraft data should be generated and analyzed. 
This will provide insight into the applicability of the statistical tests for dif- 
ferent data bases. 

Actual remotely sensed data from aircraft and spacecraft should be processed 
to obtain better insight into the characteristics of the data base. The topics 
of field homogeneity, feature selection, pattern classification, and error analysis 
should be investigated in terms of the characteristics of the data base. 


5.0 CONCLUSION 


This internal note has demonstrated that, in order for the techniques which
are examined to accurately cluster data, two conditions must be simultaneously
satisfied. The first condition is that the data must have a particular structure,
and the second is that the parameters chosen for the clustering algorithm must be
correct. By examining the structure of the data from the C1 flight line, it is
clear that there is no single set of parameters that can be used to accurately
cluster all the different crops. The effectiveness of either a one-pass or an
iterative clustering algorithm in accurately clustering data representative of the
C1 flight line is questionable. This means that, in order to use cluster analysis
in its present form for applications like assisting in the definition of field
boundaries and evaluating the homogeneity of a field, one must have extensive
a priori knowledge.

Modifications to existing techniques, or entirely new techniques, are necessary
for clustering to be a reliable tool for representative data sets. The modification
proposed here involves the use of goodness of fit tests to determine, in a
quantitative manner, a measure of the unimodality of a cluster. This also has
applications to quantitatively evaluating the homogeneity of test and training
fields.


Figure 2.- N versus R for the data of figure 1.

For each field:
   Letter in middle corresponds to field ID
   Number in lower left corresponds to class ID
   Number in lower right corresponds to number of pixels
   The agricultural crop is used in simulating C1 data.

Figure 3.- Data image (fig. 1 of ref. 2).

Figure 4.- Simulated data from the SERID program.

Figure 5.- Number of clusters versus R for the ideal case.

Figure 7.- Intensity of channels 6 and 10 for C1 data.

Figure 8.- Intensity of channels 6 and 12 for C1 data.

Figure 9.- N versus R for simulated C1 data.

Figure 10.- Image map from the noniterative clustering technique for R = 5.


Figure 11.- Image map from the noniterative clustering technique for R = 18.

Figure 12.- ISODATA results (channels 6, 10, 12;
NMIN = 30; no. of iterations = 20; with chaining).

Figure 13.- ISODATA results (channels 6, 10, 12;
NMIN = 15; no. of iterations = 20; with chaining).

Figure 14.- ISODATA results (channels 6, 10, 12;
NMIN = 30; no. of iterations = 20; no chaining).

Figure 15.- ISODATA results (channels 6, 10, 12;
NMIN = 15; no. of iterations = 20; no chaining).

Figure 16.- ISODATA results (channels 1, 6, 9, 12;
NMIN = 30; no. of iterations = 18; no chaining).

Figure 17.- ISODATA results (channels 1, 6, 9, 12;
NMIN = 30; no. of iterations = 19; no chaining).

Figure 18.- ISODATA results (channels 1, 6, 9, 12;
NMIN = 30; no. of iterations = 20; no chaining).

Figure 19.- Kolmogorov-Smirnov statistic.